Abstract

In this two-part paper, we consider multicomponent systems in which each component can iteratively 
exchange information with other components in its neighborhood in order to compute, in a distributed fashion, 
the average of the components' initial values or some other quantity of interest (i.e., some function of these 
initial values). In particular, we study an iterative algorithm for computing the average of the initial values of 
the nodes. In this algorithm, each component maintains two sets of variables that are updated via two identical 
linear iterations. The average of the initial values of the nodes can be asymptotically computed by each node 
as the ratio of two of the variables it maintains. In the first part of this paper, we show how the update rules 
for the two sets of variables can be enhanced so that the algorithm becomes tolerant to communication links 
that may drop packets, independently among them and independently between different transmission times. In 
this second part, by rewriting the collective dynamics of both iterations, we show that the resulting system is 
mathematically equivalent to a finite inhomogeneous Markov chain whose transition matrix takes one of finitely
many values at each step. Then, by using a coefficients of ergodicity approach, a method commonly used for
convergence analysis of Markov chains, we prove convergence of the robustified consensus scheme. The analysis 
suggests that similar convergence should hold under more general conditions as well. 



Note to readers: Section I discusses the relation between Part II (this report) and the companion Part I of the report,
and discusses some related work. Readers may skip Section I without loss of continuity.

I. Introduction

The focus of this paper is to analyze the convergence of the robustified double-iteration¹ algorithm
for average consensus introduced in Part I, utilizing a different framework that allows us to move
away from the probabilistic model describing the availability of communication links of Part I. More
specifically, instead of focusing on the dynamics of the first and second moments of the two iterations
to establish convergence as done in Part I, we consider a framework that builds upon the theory of
finite inhomogeneous Markov chains. In this regard, by augmenting the communication graph, we will
show that the collective dynamics of each of the two iterations can be rewritten in such a way that the
resulting system is mathematically equivalent to a finite inhomogeneous Markov chain whose transition
matrix takes values from a finite set of possible matrices. Once the problem is recast in this fashion,
tools commonly used in the analysis of inhomogeneous Markov chains, such as coefficients of ergodicity
(see, e.g., [?]), are used to prove the convergence of the algorithm.

Recall from Part I that, when the communication network is perfectly reliable (i.e., in the absence
of packet drops), the collective dynamics of the linear iterations can be described by a discrete-time
transition system with no inputs in which the transition matrix is column stochastic and primitive.
Each node then runs two identical copies of a linear iteration, with each iteration initialized differently
depending on the problem to be solved. This double-iteration algorithm is a particular instance of
the algorithm in [2] (which is a generalization of the algorithm proposed in [?]), where the matrices
describing each linear iteration are allowed to vary as time evolves, whereas in our setup (for the
ideal case when there are no communication link failures) the transition matrix is fixed over time. In
general, the algorithm described above is not robust against packet-dropping communication links. It
might be possible to robustify it by introducing message delivery acknowledgment mechanisms and
retransmission mechanisms, but this has certain overhead and drawbacks, as discussed in Section II-C.
Also, in a pure broadcast system, which is the communication model we assume in this work, it is easy
to see that the double-iteration algorithm above will not work properly. The mechanism we proposed in
Part I to robustify the double-iteration algorithm was for each node i to keep track of three quantities
of interest: i) its own internal state (as captured by the state variables maintained in the original
double-iteration scheme of [2], [4]); ii) an auxiliary variable that accounts for the total mass broadcast so far
by node i to (all of) its neighbors; and iii) another auxiliary variable that accounts for the total received
mass from each node j that sends information to node i. The details of the algorithm are provided
in Section III, but the key in analyzing convergence of the algorithm is to show that the collective
system dynamics can be rewritten by introducing additional nodes (virtual buffers) that account for
the difference between these two auxiliary variables. The resulting enhanced system is equivalent to an
inhomogeneous Markov chain whose transition matrix takes values from a finite set.

As discussed in Part I, even though it also relies on the ratio of two linear iterations, our work differs from
the work in [2] in terms of both the communication model and the nature of the protocol itself.
In this regard, a key premise in [2] is that stochasticity of the transition matrix must be maintained
over time, which requires sending nodes to know the number of nodes that are listening, suggesting
that i) either the communication links are perfectly reliable, or ii) there is some acknowledgment and
retransmission mechanism that ensures messages are delivered to the listening nodes at every round
of information exchange. In our work, we remove both assumptions, and assume a pure broadcast
model without acknowledgments and retransmissions. It is easy to see that in the presence of
lossy communication links, the algorithm in [2] does not solve the average consensus problem, as
stochasticity of the transition matrix is not preserved over time. Thus, as mentioned above, the key in
the approach we follow to analyze convergence is to augment the communication graph by introducing
additional nodes and, in order to establish the correctness of the algorithm, to show that the collective
dynamics of the resulting system is equivalent to a finite inhomogeneous Markov chain with a transition
matrix that takes values from a finite set. Once the system is rewritten in this fashion, the robust
algorithm for ratio consensus reduces to a setting similar to the one in [2], except for the fact that some
of the resulting transition matrices might not have positive diagonals, which is required for the proof
in [2]. Thus, in this regard, our approach may also be viewed as a generalization of the main result in [2].

¹In this second part, we will also refer to this algorithm as the "ratio consensus" algorithm and will use both denominations interchangeably.

The idea of augmenting the communication graph has been used in consensus problems to study the
impact of bounded (fixed and random) communication delays [5], [6], [7]. In our work, the augmented
communication graph that results from rewriting the collective system dynamics has some similarities
to the augmented communication graph in [7], where the link from node i to node j is replaced by
several paths from node i to node j, in order to mimic the effect of communication delays. In particular,
in [7], for a maximum delay of B steps, B paths are added in parallel with the single-edge path that
captures the non-delayed message transmission. The added path corresponding to delay b ($1 \leq b \leq B$)
has b nodes, for a total of $B(B+1)/2$ additional nodes capturing the effect of message transmission
delays from node i to node j. At every time step, a message from node i to node j is randomly routed
through one of these paths; the authors assume for simplicity that each of the paths is activated with
probability 1/B. For large communication graphs, one of the drawbacks of this model is the explosion
in the number of nodes to be added to the communication graph to model the effect of delays. In our
work, for analysis purposes, we also use the idea of augmenting the communication graph, but in our
case, a single parallel path is sufficient to capture the effect of packet-dropping communication links.
As briefly discussed later, our modeling formalism can also be used to capture
random delays, with the advantage over the formalism in [7] that in our model, it is only necessary to
add a single parallel path with B nodes (instead of the $B(B+1)/2$ nodes added above) per link in the
original communication graph, which reduces the number of states added. Additionally, our modeling
framework can handle any delay distribution, as long as the equivalent augmented network satisfies
properties (M1)-(M5) discussed in Section IV-A.

In order to make Part II self-contained, we review several ideas already introduced in Part I, including
the double-iteration algorithm formulation over perfectly reliable networks and its robustified version.
In Part II, we will embrace the common convention utilized in Markov chains of pre-multiplying the
transition matrix of the Markov chain by the corresponding probability vector.

The remainder of this paper is organized as follows. Section II introduces the communication model,
briefly describes the non-robust version of the double-iteration algorithm, and discusses some issues that
arise when implementing the double-iteration algorithm in networks with unreliable links. Section III
describes the strategy to robustify the double-iteration algorithm against communication link failures.
Section IV reformulates each of the two iterations in the robust algorithm as an inhomogeneous Markov
chain. We employ coefficients of ergodicity analysis to characterize the algorithm behavior in Section V.
Convergence of the robustified double-iteration algorithm is established in Section VI. Concluding
remarks and discussions on future work are presented in Section VII.

II. Preliminaries 

This section describes the communication model we adopt throughout the work, introduces notation,
reviews the double-iteration algorithm that can be used to solve consensus problems when the
communication network is perfectly reliable, and discusses issues that arise when implementing the 
double-iteration algorithm in networks with packet-dropping links. 

A. Network Communication Model 

The system under consideration consists of a network of m nodes, $\mathcal{V} = \{1, 2, \ldots, m\}$, each of which
has some initial value $v_i$, $i = 1, 2, \ldots, m$ (e.g., a temperature reading). The nodes need to reach
consensus to the average of these initial values in an iterative fashion. In other words, the goal is for
each node to obtain the value $\frac{1}{m}\sum_{j=1}^{m} v_j$ in a distributed fashion. We assume a synchronous system² in
which time is divided into time steps of fixed duration. The nodes in the network are connected by a
certain directed network. More specifically, a directed link (j, i) is said to "exist" if transmissions from
node j can be received by node i infinitely often over an infinite interval. Let $\mathcal{E}$ denote the set of all
directed links that exist in the network. For notational convenience, we take $(i, i) \in \mathcal{E}$, $\forall i$, so that a
self-loop exists at each node. Then, graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ represents the network connectivity. Let us define
$\mathcal{I}_i = \{j \mid (j, i) \in \mathcal{E}\}$ and $\mathcal{O}_i = \{j \mid (i, j) \in \mathcal{E}\}$. Thus, $\mathcal{I}_i$ consists of all nodes from whom node i has
incoming links, and $\mathcal{O}_i$ consists of all nodes to whom node i has outgoing links. For a set S, we will
denote the cardinality of set S by |S|. The outdegree of node i, denoted as $D_i$, is the size of set $\mathcal{O}_i$;
thus, $D_i = |\mathcal{O}_i|$. Due to the assumption that all nodes have self-loops, $i \in \mathcal{I}_i$ and $i \in \mathcal{O}_i$, $\forall i \in \mathcal{V}$. We
assume that graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is strongly connected. Thus, in $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, there exists a directed path
from any node i to any node j, $\forall i, j \in \mathcal{V}$ (although it is possible that the links on such a path between
a pair of nodes may not all be simultaneously reliable in a given time slot).

²We later discuss how the techniques we develop for reaching consensus using the double-iteration algorithm in the presence of
packet-dropping links naturally lead to an asynchronous computation setup.

The iterative consensus algorithms considered in this paper assume that, at each step of the iteration,
each node transmits some information to all the nodes to whom it has a reliable directed link during
that iteration (or "time step"). The iterative consensus algorithm summarized in Section II-B assumes
the special case wherein all the links are always reliable (that is, all links are reliable in every time
step). In Section III and beyond, we consider a network with potentially unreliable links. Our work on
iterative consensus over unreliable links is motivated by the presence of such links in wireless networks.
Suppose that the nodes in our network communicate over wireless links, with the node locations being
fixed. In such a wireless network, each node should generally be able to communicate with the other
nodes in its vicinity. However, such transmissions may not always be reliable, due to channel fading
and interference from other sources. To make our subsequent discussion precise, we will assume that
a link (i, j) exists (i.e., $(i, j) \in \mathcal{E}$) only if each transmission from i is successfully received by node
j with probability $q_{ij}$ ($0 < q_{ij} \leq 1$). We assume that successes of transmissions on different links are
independent of each other; also, successes of different transmissions on any given link are independent
of each other. As we will see, these independence assumptions can be partially relaxed but we adopt
them at this point for simplicity.

We assume that all transmissions from any node i are broadcasts³, in the sense that every node j
such that $(i, j) \in \mathcal{E}$ may receive i's transmission with probability $q_{ij}$, independently between nodes
and transmission steps. As seen later, this broadcast property can potentially be exploited to make
communication more efficient, particularly when a given node i wants to send identical information to
all the nodes in $\mathcal{O}_i$. When node i broadcasts a message to its neighbors, the reliabilities of receptions
at different nodes in $\mathcal{O}_i$ are mutually independent. Each node i is assumed to be aware of the value
of $D_i$ (i.e., the number of nodes in $\mathcal{O}_i$), and the identity of each node in set $\mathcal{I}_i$. This information can
be learned using neighbor discovery mechanisms used in wireless ad hoc or mesh networks. Note that
node i does not necessarily know whether transmissions to nodes in $\mathcal{O}_i$ are successful.

B. Ratio Consensus Algorithm in Perfectly Reliable Communication Networks 

In this section, we summarize a consensus algorithm for a special case of the above system, wherein 
all the links in the network are always reliable (that is, reliable in every time step). The "ratio consensus" 
algorithm presented here performs two iterative computations in parallel, with the solution of the 
consensus algorithm being asymptotically obtained as the ratio of the outcome of the two parallel 
iterations. We will refer to this approach as ratio consensus. In prior literature, similar approaches have 
also been called weighted consensus [2], [3].

Each node i maintains at iteration k state variables $y_k[i]$ and $z_k[i]$. At each time step k, each node i
updates its state variables as follows:

$$y_k[i] = \sum_{j \in \mathcal{I}_i} y_{k-1}[j] / D_j, \quad k \geq 1, \tag{1}$$

$$z_k[i] = \sum_{j \in \mathcal{I}_i} z_{k-1}[j] / D_j, \quad k \geq 1, \tag{2}$$

where $y_0[j] = v_j$, $\forall j = 1, \ldots, m$, and $z_0[j] = 1$, $\forall j = 1, \ldots, m$.

³As elaborated later, the results in this paper can also be applied in networks wherein the transmissions are unicast (not broadcast).

To facilitate implementation of the above iterations, at time step k, each node i broadcasts a message
containing values $y_{k-1}[i]/D_i$ and $z_{k-1}[i]/D_i$ to each node in $\mathcal{O}_i$, and awaits reception of a similar
message from each node in $\mathcal{I}_i$. When node i has received, from each node $j \in \mathcal{I}_i$, a value (namely,
$y_{k-1}[j]/D_j$ and $z_{k-1}[j]/D_j$) at step k, node i performs the above update of its state variables (by simply
summing the corresponding values). Hereafter, we will use the phrase "message f" to mean "message
containing value f".

The above two iterations are represented in matrix notation in (3) and (4), where $y_k$ and $z_k$ are row
vectors of size m, and M is an m x m primitive matrix⁴, such that $M[i,j] = 1/D_i$ if $j \in \mathcal{O}_i$ and
0 otherwise. Compactly, we have

$$y_k = y_{k-1} M, \quad k \geq 1, \tag{3}$$

$$z_k = z_{k-1} M, \quad k \geq 1. \tag{4}$$

It is assumed that $z_0[j] = 1$ and $y_0[j] = v_j$ are the initial values at each node $j \in \mathcal{V}$. Each node i
calculates, at each time step k, the ratio

$$v_k[i] = \frac{y_k[i]}{z_k[i]}.$$

For the transition matrix M, (a) $M[i,j] \geq 0$, and (b) for all i, $\sum_j M[i,j] = 1$. Any matrix that
satisfies these two conditions is said to be a row stochastic matrix. It has been shown in [1] that $v_k[i]$
asymptotically converges to the average of the elements of $y_0$, provided that M is primitive and row
stochastic. That is, if M is a primitive row stochastic matrix, then

$$\lim_{k \to \infty} v_k[i] = \frac{\sum_j y_0[j]}{m}, \quad \forall i \in \mathcal{V}, \tag{5}$$

where m is the number of elements in vector $y_0$.
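
As an illustration of iterations (1)-(2) and the limit in (5), the following minimal Python sketch runs both iterations on a small reliable network. The 4-node graph `OUT`, the initial values `VALUES`, and the function name `ratio_consensus` are hypothetical choices made for this example only; they do not come from the paper.

```python
def ratio_consensus(out_neighbors, values, steps=200):
    """Run iterations (1)-(2) on a perfectly reliable network.

    out_neighbors[j] lists the nodes in O_j (it must contain j itself,
    i.e. the self-loop). Returns the ratios y_k[i] / z_k[i] after `steps` steps.
    """
    m = len(values)
    y = list(values)      # y_0[j] = v_j
    z = [1.0] * m         # z_0[j] = 1
    for _ in range(steps):
        new_y = [0.0] * m
        new_z = [0.0] * m
        for j in range(m):
            d_j = len(out_neighbors[j])          # out-degree D_j
            for i in out_neighbors[j]:           # j sends y[j]/D_j and z[j]/D_j to each i in O_j
                new_y[i] += y[j] / d_j
                new_z[i] += z[j] / d_j
        y, z = new_y, new_z
    return [y[i] / z[i] for i in range(m)]


# Hypothetical strongly connected digraph with self-loops (illustrative only).
OUT = {0: [0, 1], 1: [1, 2, 0], 2: [2, 3], 3: [3, 0]}
VALUES = [10.0, 20.0, 30.0, 60.0]
ratios = ratio_consensus(OUT, VALUES)
print(ratios)  # every entry should be close to the average of VALUES
```

In this sketch the graph is not balanced (the in-degrees and out-degrees differ), so the y iteration alone would not converge to the average; it is the division by the z iteration that removes the bias.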

C. Implementation Aspects of Ratio Consensus Algorithm in the Presence of Unreliable Links 

Let us consider how we might implement iterations (3) and (4) in a wireless network. Since the
treatment for the $y_k$ and $z_k$ iterations is similar, let us focus on the $y_k$ iteration for now. Implementing
(3) requires that, at iteration k (to compute $y_k$), node i should transmit message $y_{k-1}[i]\,M[i,j]$ to each
node $j \in \mathcal{O}_i$. Conveniently, for all $j \in \mathcal{O}_i$, the values $M[i,j]$ are identical, and equal to $1/D_i$. Thus,
node i needs to send message $y_{k-1}[i]/D_i$ to each node in $\mathcal{O}_i$. Let us define

$$\mu_k[i] = y_{k-1}[i] / D_i, \quad k \geq 1.$$

In a wireless network, the two approaches described next may be used by node i to transmit message
$\mu_k[i]$ to all the nodes in $\mathcal{O}_i$.

⁴A finite square matrix A is said to be primitive if for some positive integer p, $A^p > 0$, that is, $A^p[i,j] > 0$, $\forall i, j$.

Approach 1: In this approach, each node i ensures that its message $\mu_k[i]$ is delivered reliably to
all the nodes in $\mathcal{O}_i$. One way to achieve this goal is as follows. Node i can broadcast the message
$\mu_k[i]$ on the wireless channel, and then wait for acknowledgments (acks) from all the nodes in $\mathcal{O}_i$. If
such acks are not received from all nodes in $\mathcal{O}_i$ within some timeout interval, then i can retransmit
the message. This procedure will be repeated until acks are received from all the intended recipients of
$\mu_k[i]$. This procedure ensures that the message is received by each node in $\mathcal{O}_i$ reliably in each step k of
the iteration. However, as an undesirable side-effect, the time required to guarantee the reliable delivery
to all the neighboring nodes is not fixed. In fact, this time can be arbitrarily large with a non-zero
probability, if each transmission on a link $(i, j) \in \mathcal{E}$ is reliable with probability $q_{ij} < 1$. Different nodes
may require different amounts of time to reliably deliver their message to their intended recipients.
Thus, if a fixed finite interval of time is allocated for each step k, then it becomes difficult to guarantee
that the iterations will always be performed correctly (because some messages may not be delivered
within the fixed time interval).
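
To make the "arbitrarily large with a non-zero probability" remark concrete, the sketch below computes the probability that more than t broadcast rounds are needed before every out-neighbor has received the message at least once. It is a simplified model under stated assumptions: receptions are independent across neighbors and rounds (as in Section II-A), and acknowledgments themselves are never lost, which is an extra assumption not made in the text. The function name `prob_delivery_exceeds` is hypothetical.

```python
def prob_delivery_exceeds(q_list, t):
    """P(more than t rounds are needed for all out-neighbors to receive the message).

    q_list holds the per-link success probabilities q_ij of the sender's
    outgoing links; receptions are assumed independent across links and rounds.
    """
    p_all_received = 1.0
    for q in q_list:
        # Probability that this neighbor received at least one of the t broadcasts.
        p_all_received *= 1.0 - (1.0 - q) ** t
    return 1.0 - p_all_received


# With two neighbors and q = 0.5 on both links, one round suffices only 25% of the time.
print(prob_delivery_exceeds([0.5, 0.5], 1))
print(prob_delivery_exceeds([0.5, 0.5], 10))
```

Whenever some q in `q_list` is strictly below 1, the returned probability is mathematically positive for every finite t, which is why no fixed step duration can guarantee delivery under Approach 1.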

Approach 2: Alternatively, each node i may just broadcast its message $\mu_k[i]$ once in time step k,
and hope that all the nodes in $\mathcal{O}_i$ receive it reliably. This approach has the advantage that each step
of the iteration can be performed in a short (and predictable) time interval. However, it also has the
undesirable property that some of the nodes in $\mathcal{O}_i$ may not receive the message (due to link unreliability),
and such nodes will not be able to update their state correctly. It is important to note that, since there
are no acknowledgments being sent, a node i cannot immediately know whether a node $j \in \mathcal{O}_i$ has
received i's message or not.

Considering the shortcomings of the above two approaches, it appears that an alternative solution is
required. Our solution to the problem (to be introduced in Section III) is to maintain additional state
at each node, and utilize this state to mitigate the detrimental impact of link unreliability. To put it
differently, the additional state can be used to design an iterative consensus algorithm robust to link
unreliability. In particular, the amount of state maintained by each node i is proportional to $|\mathcal{I}_i|$. In a
large-scale wireless network (i.e., with large m) with nodes spread over a large space, we would expect
that for any node i, $|\mathcal{I}_i| \ll m$. In such cases, the small increase in the amount of state is a justifiable
cost to achieve robustness in the presence of link unreliability.

Although $M[i,j]$ is identical (and equal to $1/D_i$) for all $j \in \mathcal{O}_i$ in our example above, this is not
necessary. So long as M is a primitive row stochastic matrix, the above iteration will converge to the
correct consensus value (provided that the transmissions are always reliable). Thus, it is possible that in
a given iteration, node i may want to send different messages to different nodes in $\mathcal{O}_i$. This goal can be
achieved by performing a unicast operation to each node in $\mathcal{O}_i$. In this situation as well, two approaches
analogous to Approaches 1 and 2 may be used. The first approach would be to reliably deliver the unicast
messages, using as many retransmissions as necessary. The second approach may be to transmit each
message just once. In both cases, it is possible that the iterations may not be performed correctly. To
simplify the discussion in this paper, we assume that each node i needs to transmit an identical message
to the nodes in $\mathcal{O}_i$. However, it is easy to extend the proposed scheme so that it is applicable to the
more general scenario as well.

III. Robustification of Ratio Consensus Algorithm

In this section, we present the proposed ratio consensus algorithm that is robust in the presence of link
unreliability. The correctness of the proposed algorithm is established in Section VI. As before, each
node maintains state variables $y_k[i]$ and $z_k[i]$. Additional state maintained at each node will be defined
soon. Iterative computation is performed to maintain $y_k$ and $z_k$. For brevity, we will focus on presenting
the iterations for $y_k$, but iterations for $z_k$ are analogous, with the difference being in the initial state.
The initial values of y and z are assumed⁵ to satisfy the following conditions:

1) $y_0[i] \geq 0$, $\forall i$,

2) $z_0[i] \geq 0$, $\forall i$,

3) $\sum_i z_0[i] > 0$.

Our goal for the robust iterative consensus algorithm is to allow each node i to compute (asymptotically)
the ratio

$$\frac{\sum_i y_0[i]}{\sum_i z_0[i]}.$$

With a suitable choice of $y_0[i]$ and $z_0[i]$, different functions may be calculated [4]. In particular, if the
initial input of node i is denoted as $v_i$, then by setting $y_0[i] = w_i v_i$ and $z_0[i] = w_i$, where $w_i > 0$, $\forall i$,
the nodes can compute the weighted average $\sum_i w_i v_i / \sum_i w_i$; with $w_i = 1$, $\forall i \in \mathcal{V}$, the nodes calculate average
consensus.
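
The choice of initial states described above can be sketched as follows. This is only an illustration; the helper names `initial_states` and `target_ratio` are hypothetical, and the positivity check mirrors the condition on the weights stated in the text.

```python
def initial_states(values, weights=None):
    """Return (y0, z0) with y0[i] = w_i * v_i and z0[i] = w_i.

    With weights=None every w_i is 1, which corresponds to plain average consensus.
    """
    if weights is None:
        weights = [1.0] * len(values)
    if len(weights) != len(values):
        raise ValueError("values and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("each weight w_i must be positive")
    y0 = [w * v for w, v in zip(weights, values)]
    z0 = list(weights)
    return y0, z0


def target_ratio(y0, z0):
    """The quantity sum_i y0[i] / sum_i z0[i] that each node should approach."""
    return sum(y0) / sum(z0)


y0, z0 = initial_states([10.0, 20.0], weights=[1.0, 3.0])
print(target_ratio(y0, z0))  # weighted average (1*10 + 3*20) / (1 + 3)
```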

A. Intuition Behind the Robust Algorithm 

To aid our presentation, let us introduce the notion of "mass." The initial value $y_0[i]$ at node i is to
be viewed as its initial mass. If node i sends a message v to another node j, that can be viewed as a
"transfer" of an amount of mass equal to v to node j. With this viewpoint, it helps to think of each
step k as being performed over a non-zero interval of time. Then, $y_k[i]$ should be viewed as the mass
at node i at the end of time step k (which is the same as the start of step k + 1). Thus, during step k,
each node i transfers (perhaps unsuccessfully, due to unreliable links) some mass to nodes in $\mathcal{O}_i$, the
amount being a function of $y_{k-1}[i]$. The mass $y_k[i]$ is the accumulation of the mass that i receives in
messages from nodes in $\mathcal{I}_i$ during step k.

⁵The assumption that $y_0[i] \geq 0$, $\forall i$, can be relaxed, allowing for arbitrary values for $y_0[i]$.

Now, $\sum_i y_0[i]$ is the total mass in the system initially. If we implement iteration (3) in the absence
of packet drops, then for all iterations k

$$\sum_i y_k[i] = \sum_i y_0[i].$$

That is, the total mass in the system remains constant. This invariant is maintained because M is a row
stochastic matrix. However, if a message v sent by node i is not received by some node $j \in \mathcal{O}_i$, then
the mass in that message is "lost," resulting in reduction of the total mass in the system.
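
The loss of mass can be observed directly in a small simulation. The sketch below applies the non-robust iteration with a single unacknowledged broadcast per step (Approach 2) over lossy links. The graph `OUT`, the common success probability `q`, the fixed seed, and the choice to treat self-loops as always reliable are all assumptions made for this illustration.

```python
import random


def lossy_naive_step(out_neighbors, y, q, rng):
    """One step of the non-robust iteration when each message is sent once.

    Each node j broadcasts y[j]/D_j; the copy intended for node i is simply
    lost whenever link (j, i) fails in this step.
    """
    new_y = [0.0] * len(y)
    for j, targets in out_neighbors.items():
        share = y[j] / len(targets)
        for i in targets:
            # Self-loops are assumed to be always reliable (illustrative choice).
            if i == j or rng.random() < q:
                new_y[i] += share
    return new_y


OUT = {0: [0, 1], 1: [1, 2, 0], 2: [2, 3], 3: [3, 0]}   # hypothetical graph
rng = random.Random(0)
y = [10.0, 20.0, 30.0, 60.0]
initial_mass = sum(y)
for _ in range(20):
    y = lossy_naive_step(OUT, y, q=0.7, rng=rng)
final_mass = sum(y)
print(initial_mass, final_mass)   # the total mass shrinks once any packet is dropped
```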

Our robust algorithm is motivated by the desire to avoid the loss of mass in the system, even in the
presence of unreliable links. The proposed algorithm uses Approach 2 for transmission of messages. In
particular, in our algorithm (and as in the original ratio consensus), at each step k, each node i wants
to transfer $\mu_k[i] = y_{k-1}[i]/D_i$ amount of mass to each node in $\mathcal{O}_i$. For this purpose, node i broadcasts⁶
message $\mu_k[i]$. To make the algorithm robust, let us assume that, for each link $(i, j) \in \mathcal{E}$, a "virtual
buffer" is available to store the mass that is "undelivered" on the link. For each node $j \in \mathcal{O}_i$, there are
two possibilities:

(P1) Link (i, j) is not reliable in slot k: In this case, message $\mu_k[i]$ is not received by node j. Node
i believes that it has transferred the mass to j (and thus, i does not include that mass in its own
state $y_k[i]$), and at the same time, that mass is not received at node j, and therefore, not included in
$y_k[j]$. Therefore, let us view this missing mass as being "buffered on" link (i, j) in a virtual buffer.
The virtual buffer for each directed link (i, j) will be viewed as a virtual node in the network.
Thus, when link (i, j) is unreliable, the mass is transferred from node i to "node" (i, j), instead of
being transferred to node j. Note that when link (i, j) is unreliable, node j neither receives mass
directly from node i, nor from the virtual buffer (i, j).

(P2) Link (i, j) is reliable in slot k: In this case, message $\mu_k[i]$ is received by node j. Thus, $\mu_k[i]$
contributes to $y_k[j]$. In addition, all the mass buffered in the virtual buffer (i, j) will also be
received by node j, and this mass will also contribute to $y_k[j]$. We will say that buffer (i, j)
"releases" its mass to node j.

⁶In the more general case, node i may want to transfer different amounts of mass to different nodes in $\mathcal{O}_i$. In this case, node i may
send (unreliable) unicast messages to these neighbors. The treatment in this case will be quite similar to the restricted case assumed in
our discussion, except that node i will need to separately track mass transfers to each of its out-neighbors.

We capture the above intuition by building an "augmented" network that contains all the nodes in
$\mathcal{V}$, and also contains additional virtual nodes, each virtual node corresponding to the virtual buffer for
a link in $\mathcal{E}$. Let us denote the augmented network by $\mathcal{G}^a = (\mathcal{V}^a, \mathcal{E}^a)$, where $\mathcal{V}^a = \mathcal{V} \cup \mathcal{E}$ and

$$\mathcal{E}^a = \mathcal{E} \cup \{((i, j), j) \mid (i, j) \in \mathcal{E}\} \cup \{(i, (i, j)) \mid (i, j) \in \mathcal{E}\}.$$

In case (P2) above, the mass sent by node i, and the mass released from the virtual buffer (i, j), both
contribute to the new state $y_k[j]$ at node j. In particular, it will suffice for node j to only know the sum
of the mass being sent by node i at step k and the mass being released (if any) from buffer (i, j) at
step k. In reality, of course, there is no virtual buffer to hold the mass that has not been delivered yet.
However, an equivalent mechanism can be implemented by introducing additional state at each node in
$\mathcal{V}$, which exploits the above observation. This is what we explain in the next section.
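
The construction of the augmented network can be sketched in a few lines. The function name `augment` and the representation of a virtual buffer as the tuple `(i, j)` are hypothetical choices for this example; only the node and link sets themselves follow the definition given above.

```python
def augment(nodes, links):
    """Build the augmented network with one virtual buffer node per link.

    nodes: iterable of node identifiers (the set V).
    links: iterable of directed links (i, j), including self-loops (the set E).
    Returns (aug_nodes, aug_links); a virtual node is represented by the tuple (i, j).
    """
    links = list(links)
    aug_nodes = list(nodes) + links            # V^a = V union E
    aug_links = set(links)                     # original links are kept
    for (i, j) in links:
        aug_links.add((i, (i, j)))             # node i -> buffer (i, j): used when the link fails
        aug_links.add(((i, j), j))             # buffer (i, j) -> node j: used when the buffer releases
    return aug_nodes, aug_links


# Two nodes with self-loops and links in both directions (illustrative only).
aug_nodes, aug_links = augment([0, 1], [(0, 0), (0, 1), (1, 0), (1, 1)])
print(len(aug_nodes), len(aug_links))
```

Each original link contributes exactly one extra node and two extra links, so the augmented network has m + |E| nodes and 3|E| links.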

B. Robust Ratio Consensus Algorithm 

We will mitigate the shortcomings of Approach 2 described in Section II-C by changing our iterations
to be tolerant to missing messages. The modified scheme has the following features:

• Instead of transmitting message $\mu_k[i] = y_{k-1}[i]/D_i$ at step k, each node i broadcasts at step k a
message with value $\sum_{j=1}^{k} \mu_j[i]$, denoted as $\sigma_k[i]$. Thus, $\sigma_k[i]$ is the total mass that node i wants to
transfer to each node in $\mathcal{O}_i$ through the first k steps.

• Each node i maintains, in addition to state variables $y_k[i]$ and $z_k[i]$, also a state variable $\rho_k[j,i]$ for
each node $j \in \mathcal{I}_i$; $\rho_k[j,i]$ is the total mass that node i has received either directly from node j, or
via virtual buffer (j, i), through step k.

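The computation performed at node i at step $k \geq 1$ is as follows. Note that $\sigma_0[i] = 0$, $\forall i \in \mathcal{V}$, and
$\rho_0[i,j] = 0$, $\forall (i,j) \in \mathcal{E}$.

$$\sigma_k[i] = \sigma_{k-1}[i] + y_{k-1}[i]/D_i, \tag{6}$$

$$\rho_k[j,i] = \begin{cases} \sigma_k[j], & \text{if } (j,i) \in \mathcal{E} \text{ and message } \sigma_k[j] \text{ is received by } i \text{ from } j \text{ at step } k, \\ \rho_{k-1}[j,i], & \text{if } (j,i) \in \mathcal{E} \text{ and no message is received by } i \text{ from } j \text{ at step } k, \end{cases} \tag{7}$$

$$y_k[i] = \sum_{j \in \mathcal{I}_i} \left( \rho_k[j,i] - \rho_{k-1}[j,i] \right). \tag{8}$$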

When link $(j, i) \in \mathcal{E}$ is reliable, $\rho_k[j,i]$ becomes equal to $\sigma_k[j]$; this is reasonable, because i receives
any new mass sent by j at step k, as well as any mass released by buffer (j, i) at step k. On the
other hand, when link (j, i) is unreliable, then $\rho_k[j,i]$ remains unchanged from the previous iteration,
since no mass is received from j (either directly or via virtual buffer (j, i)). It follows that the total
new mass received by node i at step k, either from node j directly or via buffer (j, i), is given by
$\rho_k[j,i] - \rho_{k-1}[j,i]$, which explains (8).
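
Updates (6)-(8) can be simulated with a short Python sketch, run in parallel for the y and z iterations. Several details here are assumptions made for illustration and are not specified in this part of the text: the graph `OUT`, a common success probability `q` for all links, a fixed random seed, a single packet per link carrying both cumulative values, and self-loops that never fail.

```python
import random


def robust_ratio_consensus(out_neighbors, y0, z0, q, steps, seed=0):
    """Simulate updates (6)-(8) for the y and z iterations over lossy links."""
    rng = random.Random(seed)
    m = len(y0)
    y, z = list(y0), list(z0)
    sigma_y = [0.0] * m                        # sigma_0[i] = 0
    sigma_z = [0.0] * m
    links = [(j, i) for j in range(m) for i in out_neighbors[j]]
    rho_y = {link: 0.0 for link in links}      # rho_0[j, i] = 0
    rho_z = {link: 0.0 for link in links}
    for _ in range(steps):
        # (6): cumulative mass that each node has tried to send per outgoing link.
        for i in range(m):
            d_i = len(out_neighbors[i])
            sigma_y[i] += y[i] / d_i
            sigma_z[i] += z[i] / d_i
        new_y = [0.0] * m
        new_z = [0.0] * m
        for (j, i) in links:
            # One packet carries both cumulative values, so one draw per link.
            # Self-loops are assumed to be always reliable (illustrative choice).
            received = (i == j) or (rng.random() < q)
            if received:
                # (7)-(8): rho jumps to sigma; the increment is the newly received mass.
                new_y[i] += sigma_y[j] - rho_y[(j, i)]
                new_z[i] += sigma_z[j] - rho_z[(j, i)]
                rho_y[(j, i)] = sigma_y[j]
                rho_z[(j, i)] = sigma_z[j]
            # Otherwise rho is unchanged and the link contributes no new mass.
        y, z = new_y, new_z
    return y, z


OUT = {0: [0, 1], 1: [1, 2, 0], 2: [2, 3], 3: [3, 0]}   # hypothetical graph
y, z = robust_ratio_consensus(OUT, [10.0, 20.0, 30.0, 60.0], [1.0] * 4, q=0.7, steps=500)
ratios = [y[i] / z[i] for i in range(4)]
print(ratios)   # each ratio should be close to the average of the initial values
```

Neither y nor z converges individually in this setting, since mass keeps moving in and out of the (implicit) buffers; only their ratio settles.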

IV. Robust Algorithm Formulation as an Inhomogeneous Markov Chain 

In this section, we reformulate each iteration performed by the robust algorithm as an inhomogeneous 
Markov chain whose transition matrix takes values from a finite set of matrices. We will also discuss 
some properties of these matrices, and analyze the behavior of their products, which helps in establishing 
the convergence of the robustified ratio consensus algorithm. 

⁷As per the algorithm specified above, observe that the values of $\sigma$ and $\rho$ increase monotonically with time. This can be a concern
for a large number of steps in practical implementations. However, this concern can be mitigated by "resetting" these values, e.g., via
the exchange of additional information between neighbors (for instance, by piggybacking cumulative acknowledgments, which will be
delivered whenever the links operate reliably).

A. Matrix Representation of Each Individual Iteration

The matrix representation is obtained by observing an equivalence between the iteration in (6)-(8)
and an iterative algorithm (to be introduced soon) defined on the augmented network described in
Section III-A. The vector state of the augmented network consists of $n = m + |\mathcal{E}|$ elements, corresponding
to the mass held by each of the m nodes, and the mass held by each of the $|\mathcal{E}|$ virtual buffers: these n
entities are represented by as many nodes in the augmented network.

With a slight abuse of notation, let us denote by $y_k$ the state of the nodes in the augmented network
$\mathcal{G}^a$. The vector $y_k$ for $\mathcal{G}^a$ is an augmented version of $y_k$ for $\mathcal{G}$. In addition to $y_k[i]$ for each $i \in \mathcal{V}$,
the augmented $y_k$ vector also includes elements $y_k[(i,j)]$ for each $(i, j) \in \mathcal{E}$, with $y_0[(i,j)] = 0$.⁸ Due
to the manner in which the $y_k[i]$'s are updated, $y_k[i]$, $i \in \mathcal{V}$, are identical in the original network and
the augmented network; therefore, we do not distinguish between them. We next translate the iterative
algorithm in (6)-(8) into the matrix form

$$y_k = y_{k-1} M_k, \tag{9}$$

for appropriate row-stochastic matrices $M_k$ (to be defined soon) that might vary as the algorithm
progresses (but nevertheless take values from a finite set of possible matrices).

Let us define an indicator variable $X_k[j,i]$ for each link $(j, i) \in \mathcal{E}$ at each time step k as follows:

$$X_k[j,i] = \begin{cases} 1, & \text{if link } (j,i) \text{ is reliable at time step } k, \\ 0, & \text{otherwise.} \end{cases} \tag{10}$$

We will now reformulate the iteration (6)-(8) and show how, in fact, it can be described in matrix form as shown in (9), where the transition matrix $M_k$ is a function of the indicator variables defined in (10). First, by using the indicator variables at time step $k$, as defined in (10), it follows from the update rule for $\rho_k[j,i]$ that

$$\rho_k[j,i] = X_k[j,i]\,\sigma_k[j] + (1 - X_k[j,i])\,\rho_{k-1}[j,i]. \tag{11}$$

Now, for $k \geq 0$, define $\nu_k[j,i] = \sigma_k[j] - \rho_k[j,i]$ (thus $\nu_0[j,i] = 0$). Then, it follows from the update rule for $\sigma_k[j]$ and (11) that

$$\nu_k[j,i] = (1 - X_k[j,i]) \left( \frac{y_{k-1}[j]}{D_j} + \nu_{k-1}[j,i] \right), \quad k \geq 1. \tag{12}$$

Also, from (11), it follows that the update rule for $y_k[i]$ can be rewritten as

$$y_k[i] = \sum_{j \in \mathcal{I}_i} X_k[j,i] \left( \frac{y_{k-1}[j]}{D_j} + \nu_{k-1}[j,i] \right), \quad k \geq 1. \tag{13}$$

At every instant $k$ that the link $(j,i)$ is not reliable, it is easy to see that the variable $\nu_k[j,i]$ increases by an amount equal to the amount that node $j$ wished to send to node $i$, but that $i$ never received due to the link failure. Similarly, at every instant $k$ that the link $(j,i)$ is reliable, the variable $\nu_k[j,i]$ becomes zero and its value at $k-1$ is received by node $i$, as can be seen in (13). Thus, from (12) and (13), we can think of the variable $\nu_k[j,i]$ as the state of a virtual node that buffers the mass that node $i$ does not receive from node $j$ every time the link $(j,i)$ fails. It is important to note that the $\nu_k[j,i]$'s are virtual variables (no node in $\mathcal{V}$ computes $\nu_k$) that just result from combining, as explained above, variables that the nodes in $\mathcal{V}$ compute. The reason for doing this is that the resulting model is equivalent to an inhomogeneous Markov chain. This can be easily seen by stacking up (13) for all nodes indexed in $\mathcal{V}$, i.e., the computing nodes, and (12) for all virtual buffers $(j,i)$, with $(j,i) \in \mathcal{E}$, and rewriting the resulting expressions in matrix form, from where the expression in (9) results.

Footnote: Similarly, $z_0[(i,j)] = 0$.
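
As a sanity check on the equivalence described above, the following Python sketch runs the running-sum bookkeeping and the virtual-buffer recursion side by side on a small network. It is only an illustration under stated assumptions: the running-sum update $\sigma_k[j] = \sigma_{k-1}[j] + y_{k-1}[j]/D_j$ and the rule $y_k[i] = \sum_j (\rho_k[j,i] - \rho_{k-1}[j,i])$ are our reading of the iteration referenced above, self-loops are assumed never to fail, and the function name `compare_forms`, the three-node ring, and the reliability probability `q` are hypothetical.

```python
import random

def compare_forms(y0, out_nbrs, steps=200, q=0.7, seed=0):
    """Run the sigma/rho bookkeeping and the buffer recursion (12)-(13) together.

    Returns (largest discrepancy seen, total mass held by nodes plus buffers).
    """
    rng = random.Random(seed)
    nodes = sorted(out_nbrs)
    links = [(j, i) for j in nodes for i in out_nbrs[j]]   # (sender, receiver)
    D = {j: len(out_nbrs[j]) for j in nodes}               # out-degree incl. self-loop

    y = {i: float(y0[i]) for i in nodes}    # node state, bookkeeping form
    sigma = {j: 0.0 for j in nodes}         # running sum sent by node j
    rho = {e: 0.0 for e in links}           # running sum received over link e
    y_b = dict(y)                           # node state, buffer form
    nu = {e: 0.0 for e in links}            # virtual buffers

    worst = 0.0
    for _ in range(steps):
        # X[e] = 1 if link e is reliable; self-loops assumed always reliable.
        X = {(j, i): 1 if (j == i or rng.random() < q) else 0 for (j, i) in links}

        # Bookkeeping form (assumed running-sum updates, then equation (11)).
        for j in nodes:
            sigma[j] += y[j] / D[j]
        new_rho = {e: X[e] * sigma[e[0]] + (1 - X[e]) * rho[e] for e in links}
        new_y = {i: 0.0 for i in nodes}
        for (j, i) in links:
            new_y[i] += new_rho[(j, i)] - rho[(j, i)]

        # Buffer form, equations (12) and (13).
        new_y_b = {i: 0.0 for i in nodes}
        new_nu = {}
        for (j, i) in links:
            pending = y_b[j] / D[j] + nu[(j, i)]
            new_y_b[i] += X[(j, i)] * pending
            new_nu[(j, i)] = (1 - X[(j, i)]) * pending

        y, rho, y_b, nu = new_y, new_rho, new_y_b, new_nu
        for i in nodes:
            worst = max(worst, abs(y[i] - y_b[i]))
        for (j, i) in links:
            worst = max(worst, abs((sigma[j] - rho[(j, i)]) - nu[(j, i)]))

    total = sum(y_b.values()) + sum(nu.values())
    return worst, total


ring = {0: [0, 1], 1: [1, 2], 2: [2, 0]}   # each node lists itself as an out-neighbor
worst, total = compare_forms({0: 1.0, 1: 2.0, 2: 6.0}, ring)
```

Up to floating-point rounding, the two forms should agree at every step, and the total mass held by nodes and buffers should stay equal to the initial total, which is what row-stochasticity of the transition matrices expresses.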

B. Structure and Properties of the Matrices $M_k$

Next, we discuss the sparsity structure of the $M_k$'s and obtain their entries by inspection of (12) and (13). Additionally, we will explore some properties of the $M_k$'s that will be helpful in the analysis conducted in Section V for characterizing the behavior of each of the individual iterations.

1) Structure of $M_k$: Let us first define the entries in row $i$ of matrix $M_k$ that corresponds to $i \in \mathcal{V}$. For $(i,j) \in \mathcal{E}$, there are two possibilities: $X_k[i,j] = 0$ or $X_k[i,j] = 1$. If $X_k[i,j] = 0$, then the mass $y_{k-1}[i]/D_i$ that node $i$ wants to send to node $j$ is added to the virtual buffer $(i,j)$. Otherwise, no new mass from node $i$ is added to buffer $(i,j)$. Therefore,

$$M_k[i,(i,j)] = (1 - X_k[i,j])/D_i. \tag{14}$$

The above value is zero if link $(i,j)$ is reliable at step $k$, and $1/D_i$ otherwise. Similarly, it follows that

$$M_k[i,j] = X_k[i,j]/D_i, \tag{15}$$

which is zero whenever link $(i,j)$ is unreliable at step $k$, and $1/D_i$ otherwise. Observe that for each $j \in \mathcal{O}_i$,

$$M_k[i,j] + M_k[i,(i,j)] = 1/D_i, \tag{16}$$

with, in fact, one of the two quantities zero and the other equal to $1/D_i$. For $(i,j) \notin \mathcal{E}$, it naturally follows that $M_k[i,j] = 0$. Similarly,

$$M_k[i,(s,r)] = 0, \quad \text{whenever } i \neq s \text{ and } (s,r) \in \mathcal{E}. \tag{17}$$

Since $|\mathcal{O}_i| = D_i$, all the elements in row $i$ of matrix $M_k$ add up to 1.

Now define row $(i,j)$ of matrix $M_k$, which describes how the mass of the virtual buffer $(i,j)$, for $(i,j) \in \mathcal{E}$, gets distributed. When link $(i,j)$ works reliably at time step $k$ (i.e., $X_k[i,j] = 1$), all the mass buffered on link $(i,j)$ is transferred to node $j$; otherwise, no mass is transferred from buffer $(i,j)$ to node $j$, and the buffer retains all its previous mass and increases it by a quantity equal to the mass that node $i$ failed to send to node $j$. These conditions are captured by defining the entries as follows:

$$M_k[(i,j),j] = X_k[i,j], \tag{18}$$

$$M_k[(i,j),(i,j)] = 1 - X_k[i,j]. \tag{19}$$

Also, for obvious reasons,

$$M_k[(i,j),p] = 0, \quad \forall p \neq j, \; p \in \mathcal{V}, \tag{20}$$

$$M_k[(i,j),(s,r)] = 0, \quad \forall (i,j) \neq (s,r), \; (s,r) \in \mathcal{E}. \tag{21}$$

Clearly, all the entries of the row labeled $(i,j)$ add up to 1, which results in $M_k$ being a row stochastic matrix for all $k \geq 1$.
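
The entry-by-entry description in (14)-(21) can be turned into a short construction. The Python sketch below is illustrative only: the function name `build_M`, the ordering of rows (nodes first, then virtual buffers), and the three-node ring with self-loops are our own assumptions, not part of the original development.

```python
def build_M(nodes, links, X):
    """Assemble one transition matrix from the link indicators X[(i, j)].

    Rows and columns are ordered as: all nodes first, then all virtual buffers.
    Entries not set below are zero, as in (17), (20) and (21).
    """
    D = {i: sum(1 for (s, _r) in links if s == i) for i in nodes}   # out-degrees
    index = {v: r for r, v in enumerate(list(nodes) + list(links))}
    n = len(index)
    M = [[0.0] * n for _ in range(n)]
    for (i, j) in links:
        x = X[(i, j)]
        M[index[i]][index[j]] = x / D[i]                 # (15): mass delivered to j
        M[index[i]][index[(i, j)]] = (1 - x) / D[i]      # (14): mass diverted to buffer
        M[index[(i, j)]][index[j]] = float(x)            # (18): buffer released to j
        M[index[(i, j)]][index[(i, j)]] = float(1 - x)   # (19): buffer retained
    return M, index


nodes = [0, 1, 2]
links = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 0)]   # ring plus self-loops
X = {e: 1 for e in links}
X[(0, 1)] = 0                                             # link (0, 1) drops its packet
M, index = build_M(nodes, links, X)
row_sums = [sum(row) for row in M]
```

In this example the row of node 0 places weight $1/2$ on itself and weight $1/2$ on the buffer $(0,1)$, and every row sums to 1.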

2) Properties of $M_k$: Let us denote the set of all possible instances (depending on the values of the indicator variables $X_k[i,j]$, $(i,j) \in \mathcal{E}$, $k \geq 1$) of matrix $M_k$ as $\mathcal{M}$. The matrices in the set $\mathcal{M}$ have the following properties:

(M1) The set $\mathcal{M}$ is finite.

Each distinct matrix in $\mathcal{M}$ corresponds to a different instantiation of the indicator variables defined in (10), resulting in exactly $2^{|\mathcal{E}|}$ distinct matrices in $\mathcal{M}$.

(M2) Each matrix in $\mathcal{M}$ is a finite-dimensional square row stochastic matrix.

The number of rows of each matrix $M_k \in \mathcal{M}$, as defined above, is $n = m + |\mathcal{E}|$, which is finite. Also, from (14)-(21), these matrices are square row-stochastic matrices.

(M3) Each positive element of any matrix in $\mathcal{M}$ is lower bounded by a positive constant.

Let us denote this lower bound as $c$. Then, due to the manner in which matrices in $\mathcal{M}$ are constructed, we can define $c$ to be the positive constant obtained as

$$c = \min_{i,j,M \,\mid\, M \in \mathcal{M},\; M[i,j] > 0} M[i,j].$$

(M4) The matrix $M_k$, $k > 0$, may be chosen to be any matrix $M \in \mathcal{M}$ with a non-zero probability. The choice of the transition matrix at each time step is independent and identically distributed (i.i.d.) due to the assumption that link failures are independent (between nodes and time steps).

Explanation: The probability distribution on $\mathcal{M}$ is a function of the probability distribution on the link reliability. In particular, if a certain $M \in \mathcal{M}$ is obtained when the links in $\mathcal{E}' \subseteq \mathcal{E}$ are reliable, and the remaining links are unreliable, then the probability that $M_k = M$ is equal to

$$\prod_{(i,j) \in \mathcal{E}'} q_{ij} \prod_{(i,j) \in \mathcal{E} - \mathcal{E}'} (1 - q_{ij}). \tag{22}$$

(M5) For each $i \in \mathcal{V}$, there exists a finite positive integer $l_i$ such that it is possible to find $l_i$ matrices in $\mathcal{M}$ (possibly with repetition) such that their product (in a chosen order) is a row stochastic matrix with the column that corresponds to node $i$ containing strictly positive entries.

This property states that, for each $i \in \mathcal{V}$, there exists a matrix $T_i^*$, obtained as the product of $l_i$ matrices in $\mathcal{M}$, that has the following properties:

$$T_i^*[j,i] > 0, \quad \forall j \in \mathcal{V}, \tag{23}$$

$$T_i^*[(j_1,j_2),i] > 0, \quad \forall (j_1,j_2) \in \mathcal{E}. \tag{24}$$

This follows from the fact that the underlying graph $\mathcal{G}^a$ is strongly connected (in fact, it can be easily shown that $l_i \leq m$). To simplify the presentation below, and due to the self-loops, we can take $l_i$ to be equal to a constant $l$, for all $i \in \mathcal{V}$. However, it should be easy to see that the arguments below can be generalized to the case when the $l_i$'s may be different.

We can also show that under our assumption for link failures, there exists a single matrix, say $T^*$, which simultaneously satisfies the conditions in (23)-(24) for all $i \in \mathcal{V}$. When all the links in the network operate reliably, network $\mathcal{G}(\mathcal{V},\mathcal{E})$ is strongly connected (by assumption). Since $\mathcal{G}$ is strongly connected, there is a directed path between every pair of nodes $i$ and $j$, $i, j \in \mathcal{V}$. In the augmented network $\mathcal{G}^a$, for each $(i,j) \in \mathcal{E}$, there is a link from node $i$ to node $(i,j)$, and a link from node $(i,j)$ to node $j$. Thus, it should be clear that the augmented network $\mathcal{G}^a$ is strongly connected as well. Consider a spanning tree rooted at node 1, such that all the nodes in $\mathcal{V}^a = \mathcal{V} \cup \mathcal{E}$ have a directed path towards node 1, and also a spanning tree in which all the nodes have directed paths from node 1. Choose that matrix, say $M^* \in \mathcal{M}$, which corresponds to all the links on these two spanning trees, as well as self-loops at all $i \in \mathcal{V}$, being reliable. If the total number of links that are thus reliable is $e$, it should be obvious that $(M^*)^e$ will contain only non-zero entries in columns corresponding to $i \in \mathcal{V}$. Thus, $l$ defined above may be chosen as $e$. There are several other ways of constructing $T^*$, some of which may result in a smaller value of $l$.
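
Property (M5) can be checked numerically on a small example. The Python sketch below is a simplified illustration, not the spanning-tree construction described above: it assumes a three-node ring with self-loops and takes $M^*$ to be the matrix in which every link is reliable. The helper names `all_reliable_M`, `matmul`, and `node_columns_positive` are hypothetical.

```python
def all_reliable_M(nodes, links):
    """Transition matrix when every link is reliable (nodes first, then buffers)."""
    D = {i: sum(1 for (s, _r) in links if s == i) for i in nodes}
    index = {v: r for r, v in enumerate(list(nodes) + list(links))}
    n = len(index)
    M = [[0.0] * n for _ in range(n)]
    for (i, j) in links:
        M[index[i]][index[j]] = 1.0 / D[i]     # node i sends 1/D_i of its mass to j
        M[index[(i, j)]][index[j]] = 1.0       # buffer (i, j) releases everything to j
    return M

def matmul(A, B):
    """Plain-Python matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def node_columns_positive(T, n_nodes):
    """True if every entry in the columns of the computing nodes is positive."""
    return all(row[c] > 0 for row in T for c in range(n_nodes))


nodes = [0, 1, 2]
links = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 0)]
M_star = all_reliable_M(nodes, links)
P2 = matmul(M_star, M_star)     # (M*)^2
P3 = matmul(P2, M_star)         # (M*)^3
```

For this ring, two steps are not enough, because the buffer $(0,1)$ cannot reach node 0 in two hops, whereas three steps suffice; this agrees with the bound $l_i \leq m$ with $m = 3$.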

V. Ergodicity Analysis of Products of Matrices 

We will next analyze the ergodic behavior of the forward product $T_k = M_1 M_2 \cdots M_k = \prod_{j=1}^{k} M_j$, where $M_j \in \mathcal{M}$, $\forall j = 1, 2, \ldots, k$. Informally defined, weak ergodicity of $T_k$ obtains if the rows of $T_k$ tend to equalize as $k \to \infty$. In this work, we focus on the weak ergodicity notion, and establish probabilistic statements pertaining to the ergodic behavior of $T_k$. The analysis builds upon a large body of literature on products of nonnegative matrices (see, e.g., [TJ for a comprehensive account). First, we introduce the basic toolkit adopted from [8], [0, [HI, and then use it to analyze the ergodicity of $T_k$.

A. Some Results Pertaining to Coefficients of Ergodicity

Informally speaking, a coefficient of ergodicity of a matrix $A$ characterizes how different two rows of $A$ are. For a row stochastic matrix $A$, proper coefficients of ergodicity $\delta(A)$ and $\lambda(A)$ are defined as:

$$\delta(A) := \max_{j} \max_{i_1, i_2} \left| A[i_1,j] - A[i_2,j] \right|, \tag{25}$$

$$\lambda(A) := 1 - \min_{i_1, i_2} \sum_{j} \min(A[i_1,j], A[i_2,j]). \tag{26}$$

It is easy to see that $0 \leq \delta(A) \leq 1$ and $0 \leq \lambda(A) \leq 1$, and that the rows are identical if and only if $\delta(A) = 0$. Additionally, $\lambda(A) = 0$ if and only if $\delta(A) = 0$.

Footnote: Any scalar function $\tau(\cdot)$ continuous on the set of $n \times n$ row stochastic matrices, which satisfies $0 \leq \tau(A) \leq 1$, is said to be a proper coefficient of ergodicity if $\tau(A) = 0$ if and only if $A = e^T v$, where $e$ is the all-ones row vector, and $v \geq 0$ is such that $v e^T = 1$ |T|.

The next result establishes a relation between the coefficient of ergodicity $\delta(\cdot)$ of a product of row stochastic matrices, and the coefficients of ergodicity $\lambda(\cdot)$ of the individual matrices defining the product. This result will be used in the proof of Lemma 2. It was established in [8] and also follows from the more general statement of Theorem 4.8 in [jTl|.

Proposition 1: For any $p$ square row stochastic matrices $A_1, A_2, \ldots, A_{p-1}, A_p$,

$$\delta(A_1 A_2 \cdots A_{p-1} A_p) \leq \left( \prod_{i=1}^{p-1} \lambda(A_i) \right) \delta(A_p) \leq \prod_{i=1}^{p} \lambda(A_i). \tag{27}$$

The result in (27) is particularly useful to infer ergodicity of a product of matrices from the ergodic properties of the individual matrices in the product. For example, if $\lambda(A_i)$ is bounded above by a constant less than 1 for all $i$, then $\delta(A_1 A_2 \cdots A_{p-1} A_p)$ will tend to zero as $p \to \infty$. We will next introduce an important class of matrices for which $\lambda(\cdot) < 1$.

Definition 1: A matrix $A$ is said to be a scrambling matrix, if $\lambda(A) < 1$ [[T]|.

In a scrambling matrix $A$, since $\lambda(A) < 1$, for each pair of rows $i_1$ and $i_2$, there exists a column $j$ (which may depend on $i_1$ and $i_2$) such that $A[i_1,j] > 0$ and $A[i_2,j] > 0$, and vice-versa. As a special case, if any one column of a row stochastic matrix $A$ contains only non-zero entries, then $A$ must be scrambling.
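
The two coefficients and the scrambling property are easy to compute for small matrices. The following Python sketch is an illustration only; the function names `delta`, `lam`, `is_scrambling`, and `matmul`, and the two example matrices, are our own choices. It evaluates (25) and (26) directly and checks the first inequality of Proposition 1 for a product of two matrices.

```python
def delta(A):
    """delta(A) as in (25): largest gap between two entries of the same column."""
    return max(max(col) - min(col) for col in zip(*A))

def lam(A):
    """lambda(A) as in (26): one minus the smallest overlap between two rows."""
    return 1.0 - min(sum(min(a, b) for a, b in zip(r1, r2)) for r1 in A for r2 in A)

def is_scrambling(A):
    """A row stochastic matrix is scrambling when lambda(A) < 1."""
    return lam(A) < 1.0

def matmul(A, B):
    """Plain-Python matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A1 = [[0.5, 0.5, 0.0],
      [0.0, 0.5, 0.5],
      [0.5, 0.0, 0.5]]      # every pair of rows shares a positive column
A2 = [[1.0, 0.0, 0.0],
      [0.5, 0.5, 0.0],
      [0.0, 0.0, 1.0]]      # rows 0 and 2 share no positive column

lhs = delta(matmul(A1, A2))          # delta of the product
rhs = lam(A1) * delta(A2)            # bound from Proposition 1 with p = 2
```

Here `A1` is scrambling with $\lambda(A_1) = 1/2$, while `A2` is not scrambling; the product still satisfies $\delta(A_1 A_2) \leq \lambda(A_1)\,\delta(A_2)$, so a single scrambling factor is enough to contract the product.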

B. Ergodicity Analysis of Iterations of the Robust Algorithm 

We next analyze the ergodic properties of the products of matrices that result from each of the iterations comprising our robust algorithm. Let us focus on just one of the iterations, say $y_k$, as the treatment of the $z_k$ iteration is identical. As described in Section IV, the progress of the $y_k$ iteration can be recast as an inhomogeneous Markov chain

$$y_k = y_{k-1} M_k, \quad k \geq 1, \tag{28}$$

where $M_k \in \mathcal{M}$, $\forall k$. As already discussed, the sequence of $M_k$'s that will govern the progress of $y_k$ is determined by communication link availability. Defining $T_k = \prod_{j=1}^{k} M_j$, we obtain from (28):

$$y_k = y_0 M_1 M_2 \cdots M_k = y_0 \prod_{j=1}^{k} M_j = y_0 T_k, \quad k \geq 1. \tag{29}$$

By convention, $\prod_{j=k}^{0} M_j = I$ for any $k \geq 1$ ($I$ denotes the $n \times n$ identity matrix).
Recalling the constant $l$ defined in (M5), define $W_k$ as follows,

$$W_k = \prod_{j=(k-1)l+1}^{kl} M_j, \quad k \geq 1, \; M_j \in \mathcal{M}, \tag{30}$$

from where it follows that

$$T_{lk} = \prod_{j=1}^{k} W_j, \quad k \geq 1. \tag{31}$$

Observe that the sets of time steps "covered" by $W_i$ and $W_j$, $i \neq j$, are non-overlapping. It is also important to note for subsequent analysis that, since the $M_k$'s are row stochastic matrices and the product of any number of row stochastic matrices is row stochastic, all the $W_k$'s and $T_k$'s are also row stochastic matrices.

Lemma 2 will establish that as the number of iteration steps goes to infinity, the rows of the matrix $T_k$ tend to equalize. For proving Lemma 2, we need the result in Lemma 1 stated below, which establishes that there exists a nonzero probability of choosing matrices in $\mathcal{M}$ such that the $W_k$'s as defined in (30) are scrambling.

Lemma 1: There exist constants $w > 0$ and $d < 1$ such that, with probability equal to $w$, $\lambda(W_k) \leq d$ for $k \geq 1$, independently for different $k$.

Proof: Each $W_k$ matrix is a product of $l$ matrices from the set $\mathcal{M}$. The choice of the $M_k$'s that form $W_i$ and $W_j$ is independent for $i \neq j$, since $W_i$ and $W_j$ "cover" non-overlapping intervals of time. Thus, under the i.i.d. assumption for selection of matrices from $\mathcal{M}$ (property (M4)), and property (M5), it follows that, with a non-zero probability (independently for $W_k$ and $W_{k'}$ for $k \neq k'$), matrix $W_k$ for each $k$ is scrambling. Let us denote by $w$ the probability that $W_k$ is scrambling.

Let us define $\mathcal{W}$ as the set of all possible instances of $W_k$ that are scrambling. The set $\mathcal{W}$ is finite because the set $\mathcal{M}$ is finite, and $\mathcal{W}$ is also non-empty (this follows from the discussion of (M5)). Let us define $d$ as the tight upper bound on $\lambda(W)$, for $W \in \mathcal{W}$, i.e.,

$$d = \max_{W \in \mathcal{W}} \lambda(W). \tag{32}$$

Recall that $\lambda(A)$ for any scrambling matrix $A$ is strictly less than 1. Since $\mathcal{W}$ is non-empty and finite, and contains only scrambling matrices, it follows that

$$d < 1. \tag{33}$$

■

Lemma 2: There exist constants $\alpha$ and $\beta$ ($0 < \alpha < 1$, $0 < \beta < 1$) such that, with probability greater than $(1 - \alpha^k)$, $\delta(T_k) < \beta^k$ for $k \geq 8l/w$.

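Proof: Let $k^* = \lfloor k/l \rfloor$ and $\Delta = k - l k^*$. Thus, $0 \leq \Delta < l$. From (29) through (31), observe that

$$T_k = T_{lk^*+\Delta} = T_{lk^*} \prod_{j=1}^{\Delta} M_{lk^*+j},$$

where $T_{lk^*}$ is the product of $k^*$ of the $W_j$ matrices, where $1 \leq j \leq k^*$. As per Lemma 1, for each $W_j$, the probability that $\lambda(W_j) \leq d < 1$ is equal to $w$. Thus, the expected number of scrambling matrices among the $k^*$ matrices is $wk^*$. Denote by $S$ the actual number of scrambling $W_j$ matrices among the $k^*$ matrices. Then the Chernoff lower tail bound tells us that, for any $\phi > 0$,

$$\Pr\{S < (1 - \phi) E(S)\} < e^{-E(S)\phi^2/2} \tag{34}$$

$$\Rightarrow \; \Pr\{S < (1 - \phi)(wk^*)\} < e^{-(wk^*)\phi^2/2}. \tag{35}$$

Let us choose $\phi = \frac{1}{2}$. Then,

$$\Pr\{S < (wk^*)/2\} < e^{-wk^*/8} \tag{36}$$

$$\Rightarrow \; \Pr\{S \geq wk^*/2\} > 1 - e^{-wk^*/8}. \tag{37}$$

Thus, at least $\lfloor wk^*/2 \rfloor$ of the $W$ matrices from the $k^*$ matrices forming $T_{lk^*}$ are scrambling (each with $\lambda$ value $\leq d$, by Lemma 1) with probability greater than $1 - e^{-wk^*/8}$. Proposition 1 then implies that

$$\delta(T_k) = \delta\!\left( \left( \prod_{i=1}^{k^*} W_i \right) \left( \prod_{j=1}^{\Delta} M_{lk^*+j} \right) \right) \leq \left( \prod_{i=1}^{k^*} \lambda(W_i) \right) \left( \prod_{j=1}^{\Delta} \lambda(M_{lk^*+j}) \right).$$

Since at least $\lfloor wk^*/2 \rfloor$ of the $W_i$'s have $\lambda(W_i) \leq d$ with probability greater than $1 - e^{-wk^*/8}$, and $\lambda(M_j) \leq 1$, $\forall j$, it follows that

$$\delta(T_k) \leq d^{\lfloor wk^*/2 \rfloor} \tag{38}$$

with probability exceeding

$$1 - e^{-wk^*/8}. \tag{39}$$

Let us define $\alpha = e^{-w/(16l)}$ and $\beta = d^{w/(8l)}$. Now, if $k \geq 8l/w$, then it follows that $k \geq 2l$, and

$$k^* = \left\lfloor \frac{k}{l} \right\rfloor \geq \frac{k}{2l} \tag{40}$$

$$\Rightarrow \; \frac{wk^*}{2} \geq \frac{wk}{4l} \tag{41}$$

$$\Rightarrow \; \left\lfloor \frac{wk^*}{2} \right\rfloor \geq \left\lfloor \frac{wk}{4l} \right\rfloor \tag{42}$$

$$\Rightarrow \; \left\lfloor \frac{wk^*}{2} \right\rfloor \geq \frac{wk}{8l} \quad \text{(because } wk/(4l) \geq 2\text{)} \tag{43}$$

$$\Rightarrow \; d^{\lfloor wk^*/2 \rfloor} \leq d^{wk/(8l)} = \beta^k \quad \text{(because } 0 < d < 1\text{)}. \tag{44}$$

Similarly, if $k \geq 8l/w$, it follows that

$$k^* = \left\lfloor \frac{k}{l} \right\rfloor \geq \frac{k}{2l} \tag{45}$$

$$\Rightarrow \; e^{-wk^*/8} \leq e^{-wk/(16l)} = \alpha^k \tag{46}$$

$$\Rightarrow \; 1 - e^{-wk^*/8} \geq 1 - \alpha^k. \tag{47}$$

By substituting (44) and (47) into (38) and (39), respectively, the result follows. ■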
Note that $\alpha$ and $\beta$ in Lemma 2 are independent of time. The threshold on $k$ for which Lemma 2 holds, namely $k \geq 8l/w$, can be improved by using better bounds in (40) and (45). Knowing a smaller threshold on $k$ for which Lemma 2 holds can be beneficial in a practical implementation. In the above derivation for Lemma 2, we chose a somewhat loose threshold in order to maintain a simpler form for the probability expression (namely, $1 - \alpha^k$) and also a simpler expression for the bound on $\delta(T_k)$ (namely, $\beta^k$).
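
To get a feel for how loose the threshold is, the short Python sketch below evaluates the constants of Lemma 2 for sample parameters and checks the floor inequality used in the proof. It assumes the constants as we read them from the derivation, namely $\alpha = e^{-w/(16l)}$ and $\beta = d^{w/(8l)}$; the function names and the numerical values of $w$, $d$, and $l$ are hypothetical.

```python
import math

def lemma2_constants(w, d, l):
    """Return (alpha, beta, smallest integer k with k >= 8l/w)."""
    alpha = math.exp(-w / (16 * l))      # assumed form of alpha
    beta = d ** (w / (8 * l))            # assumed form of beta
    threshold = math.ceil(8 * l / w)
    return alpha, beta, threshold

def floor_bound_holds(w, l, k):
    """Check floor(w * k_star / 2) >= w * k / (8 l), with k_star = floor(k / l)."""
    k_star = k // l
    return math.floor(w * k_star / 2) >= w * k / (8 * l)


w, d, l = 0.25, 0.75, 3                  # illustrative values only
alpha, beta, k_min = lemma2_constants(w, d, l)
ok = all(floor_bound_holds(w, l, k) for k in range(k_min, k_min + 500))
```

With these sample values the threshold is $k \geq 96$, and both $\alpha$ and $\beta$ are close to 1, which illustrates that the guaranteed rate is conservative compared with what one would typically observe in simulation.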

Lemma 3: $\delta(T_k)$ converges almost surely to 0.

Proof: For $k \geq 8l/w$, from Lemma 2 we have that $\Pr\{\delta(T_k) > \beta^k\} \leq \alpha^k$, $0 < \alpha < 1$, $0 < \beta < 1$. Then, it is easy to see that $\sum_{k} \Pr\{\delta(T_k) > \beta^k\} \leq 8l/w + \sum_{k} \alpha^k < \infty$. Then, by the first Borel-Cantelli lemma, $\Pr\{\text{the event that } \delta(T_k) > \beta^k \text{ occurs infinitely often}\} = 0$. Therefore, $\delta(T_k)$ converges to 0 almost surely. ■



VI. Convergence Analysis of Robustified Ratio Consensus Algorithm

The analysis below shows that the ratio algorithm achieves asymptotic consensus correctly in the presence of the virtual nodes, even if the diagonals of the transition matrices ($M_k$'s) are not always strictly positive. A key consequence is that the value of $z_k[i]$ is not necessarily greater than zero (at least not for all $k$), which creates some difficulty when calculating the ratio $y_k[i]/z_k[i]$. As noted earlier, aside from these differences, our algorithm is similar to that analyzed in fTl. Our proof has some similarities to the proof in [2], with the differences accounting for our relaxed assumptions.

By defining $z_k$ in an analogous way as we defined state $y_k$ in Section IV, the robustified version of the ratio consensus algorithm in (6)-(8) can be described in matrix form as

$$y_k = y_{k-1} M_k, \quad k \geq 1, \tag{48}$$

$$z_k = z_{k-1} M_k, \quad k \geq 1, \tag{49}$$

where $M_k \in \mathcal{M}$, $k \geq 1$, $y_0[i] \geq 0$, $\forall i$, $z_0[i] \geq 0$, $\forall i$, and $\sum_j z_0[j] > 0$, and $y_0[(i,j)] = z_0[(i,j)] = 0$, $\forall (i,j) \in \mathcal{E}$. The same matrix $M_k$ is used at step $k$ of the iterations in (48) and (49); however, $M_k$ may vary over $k$. Recall that $y_k$ and $z_k$ in (48) and (49) have $n$ elements, but only the first $m$ elements correspond to computing nodes in the augmented network $\mathcal{G}^a$; the remaining entries in $y_k$ and $z_k$ correspond to virtual buffers.
The goal of the algorithm is for each computing node to obtain a consensus value defined as

$$\pi^* = \frac{\sum_j y_0[j]}{\sum_j z_0[j]}. \tag{50}$$

To achieve this goal, each node $i \in \mathcal{V}$ calculates

$$\pi_k[i] = \frac{y_k[i]}{z_k[i]}, \tag{51}$$

whenever the denominator is large enough, i.e., whenever

$$z_k[i] \geq \mu, \tag{52}$$

for some constant $\mu > 0$ to be defined later. We will show that, for each $i = 1, 2, \ldots, m$, the sequence $\pi_k[i]$ thus calculated asymptotically converges to the desired consensus value $\pi^*$. To show this, we first establish that (52) occurs infinitely often, thus computing nodes can calculate the ratio in (51) infinitely often. Then, we will show that as $k$ goes to infinity, the sequence of ratio computations in (51) will converge to the value in (50).

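The convergence when $\sum_j y_0[j] = 0$ can be shown trivially. So let us now consider the case when $\sum_j y_0[j] > 0$, and define new state variables $\bar{y}_k$ and $\bar{z}_k$ for $k \geq 0$ as follows:

$$\bar{y}_k[i] = \frac{y_k[i]}{\sum_j y_0[j]}, \quad \forall i, \tag{53}$$

$$\bar{z}_k[i] = \frac{z_k[i]}{\sum_j z_0[j]}, \quad \forall i. \tag{54}$$

Thus, $\bar{y}_k$ and $\bar{z}_k$ are defined by normalizing $y_k$ and $z_k$. It follows that $\bar{y}_0$ and $\bar{z}_0$ are stochastic row vectors. Also, since our transition matrices are row stochastic, it follows that $\bar{y}_k$ and $\bar{z}_k$ are also stochastic vectors for all $k \geq 0$.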

We assume that each node knows a lower bound on $\sum_j z_0[j]$, denoted by $\mu_z$. In typical scenarios, for all $i \in \mathcal{V}$, $z_0[i]$ will be positive, and node $i \in \mathcal{V}$ can use $z_0[i]$ as a non-zero lower bound on $\sum_j z_0[j]$ (thus, in general, the lower bound used by different nodes may not be identical). We also assume an upper bound, say $\mu_y$, on $\sum_j y_0[j]$.

Let us define

$$\mu = \frac{\mu_z c^l}{n}. \tag{55}$$

As time progresses, each node $i \in \mathcal{V}$ will calculate a new estimate of the consensus value whenever $z_k[i] \geq \mu$. The next lemma establishes that nodes can carry out this calculation infinitely often.

Lemma 4: Let $\mathcal{T}_i = \{\tau_i^1, \tau_i^2, \ldots\}$ denote the sequence of time instances when node $i$ updates its estimate of the consensus using (51), and obeying (52), where $\tau_i^j < \tau_i^{j+1}$, $j \geq 1$. The sequence $\mathcal{T}_i$ contains infinitely many elements with probability 1.
Proof: To prove the lemma, it will suffice to prove that for infinitely many values of $k$, $z_k[i] \geq \mu$, with probability 1. Assumptions (M1)-(M5) imply that each matrix $W_j$, $j \geq 1$ (defined in (30)) contains a strictly positive column corresponding to index $i \in \mathcal{V}$ with a non-zero probability, say $\gamma_i > 0$. Also, the choices of $W_{k_1}$ and $W_{k_2}$ are independent of each other for $k_1 \neq k_2$. Therefore, the second Borel-Cantelli lemma implies that, with probability 1, for infinitely many values of $j$, $W_j$ will have the $i$-th column strictly positive. Since the non-zero elements of each matrix in $\mathcal{M}$ are all greater than or equal to $c$, $c > 0$ (by property (M3)), and since $W_j$ is a product of $l$ matrices in $\mathcal{M}$, it follows that all the non-zero elements of each $W_j$ must be lower bounded by $c^l$.

Consider only those $j \geq 1$ for which $W_j$ contains a positive $i$-th column. As noted above, there are infinitely many such $j$ values. Now,

$$z_{jl} = z_{(j-1)l} W_j.$$

As noted above, $\bar{z}_k$ is a stochastic vector. Thus, for any $k \geq 0$,

$$\sum_i \bar{z}_k[i] = 1 \tag{56}$$

and at least one of the elements of $\bar{z}_{(j-1)l}$ must be greater than or equal to $1/n$. Also, all the elements in columns of $W_j$ indexed by $i \in \mathcal{V}$ are lower bounded by $c^l$ (recall that we are now only considering those $j$ for which the $i$-th column of $W_j$ is positive). This implies that

$$\bar{z}_{jl}[i] \geq c^l/n \tag{57}$$

$$\Rightarrow \; z_{jl}[i] \geq \Big( \sum_j z_0[j] \Big) c^l/n \tag{58}$$

$$\Rightarrow \; z_{jl}[i] \geq \mu_z c^l/n \tag{59}$$

$$\Rightarrow \; z_{jl}[i] \geq \mu, \quad \forall i \in \mathcal{V} \quad \text{(by (55))}. \tag{60}$$

Since infinitely many $W_j$'s will contain a positive $i$-th column (with probability 1), (60) holds for infinitely many $j$ with probability 1. Therefore, with probability 1, the set $\mathcal{T}_i = \{\tau_i^1, \tau_i^2, \ldots\}$ contains infinitely many elements, for all $i \in \mathcal{V}$. ■
Finally, the next theorem shows that the ratio consensus algorithm will converge to the consensus value defined in (50).

Theorem 1: Let $\pi_i[t]$ denote node $i$'s estimate of the consensus value calculated at time $\tau_i^t$. For each node $i \in \mathcal{V}$, with probability 1, the estimate $\pi_i[t]$ converges to

$$\pi^* = \frac{\sum_j y_0[j]}{\sum_j z_0[j]}.$$

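Proof: Note that the transition matrices $M_k$, $k \geq 1$, are randomly drawn from a certain distribution. By an "execution" of the algorithm, we will mean a particular instance of the $M_k$ sequence. Thus, the distribution on the $M_k$'s results in a distribution on the executions. Lemma 3 implies that

$$\Pr\Big\{ \lim_{k \to \infty} \delta(T_k) = 0 \Big\} = 1.$$

Together, Lemmas 3 and 4 imply that, with probability 1, for a chosen execution, (i) for any $\psi > 0$, there exists a finite $k_\psi$ such that, for all $k \geq k_\psi$, $\delta(T_k) < \psi$, and (ii) there exist infinitely many values of $k \geq k_\psi$ such that $z_k[i] \geq \mu$ (i.e., $k \in \mathcal{T}_i$ for the chosen execution).

Consider any $k \geq k_\psi$ such that $z_k[i] \geq \mu$. Since $\delta(T_k) < \psi$, the rows of matrix $T_k$ are "within $\psi$" of each other. Observe that $\bar{y}_k$ is obtained as the product of the stochastic row vector $\bar{y}_0$ and $T_k$. Thus, $\bar{y}_k$ is in the convex hull of the rows of $T_k$. Similarly, $\bar{z}_k$ is in the convex hull of the rows of $T_k$. Therefore, the $j$-th elements of $\bar{y}_k$ and $\bar{z}_k$ are within $\psi$ of each other, for all $j$. Therefore,

$$\left| \bar{y}_k[i] - \bar{z}_k[i] \right| \leq \psi \tag{61}$$

$$\Rightarrow \; \left| \frac{\bar{y}_k[i]}{\bar{z}_k[i]} - 1 \right| \leq \frac{\psi}{\bar{z}_k[i]} \tag{62}$$

$$\Rightarrow \; \left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \leq \frac{\psi \sum_j y_0[j]}{z_k[i]} \quad \text{(by (53) and (54))} \tag{63}$$

$$\Rightarrow \; \left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \leq \frac{\psi \mu_y}{z_k[i]} \quad \text{(because } \textstyle\sum_j y_0[j] \leq \mu_y\text{)} \tag{64}$$

$$\Rightarrow \; \left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \leq \frac{\psi \mu_y}{\mu} \quad \text{(because } z_k[i] \geq \mu\text{)}. \tag{65}$$

Now, given any $\epsilon > 0$, let us choose $\psi = \epsilon \mu / \mu_y$. Then (65) implies that

$$\left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \leq \epsilon$$

whenever $k \geq k_\psi$ and $k \in \mathcal{T}_i$. Therefore, in the limit, $\frac{y_k[i]}{z_k[i]}$ for $k \in \mathcal{T}_i$ converges to $\frac{\sum_j y_0[j]}{\sum_j z_0[j]}$. This result holds with probability 1, since conditions (i) and (ii) stated above hold with probability 1. ■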
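
The convergence claim of Theorem 1 can be observed in a small simulation. The Python sketch below is a minimal illustration under our own assumptions, not the authors' implementation: it uses the virtual-buffer form of the iterations directly, applies the same link outcomes to the $y$ and $z$ iterations, assumes self-loops never fail, and sets $z_0[i] = 1$ so that the consensus value is the average of the initial values. The function name, the three-node ring, the reliability probability `q`, and the seed are hypothetical.

```python
import random

def robust_ratio_consensus(values, out_nbrs, steps=2000, q=0.7, seed=1):
    """Simulate the y and z iterations with packet drops; return the ratios y/z."""
    rng = random.Random(seed)
    nodes = sorted(out_nbrs)
    links = [(j, i) for j in nodes for i in out_nbrs[j]]   # (sender, receiver)
    D = {j: len(out_nbrs[j]) for j in nodes}

    y = {i: float(values[i]) for i in nodes}
    z = {i: 1.0 for i in nodes}            # z_0[i] = 1, so pi* is the plain average
    buf_y = {e: 0.0 for e in links}        # virtual buffers start empty
    buf_z = {e: 0.0 for e in links}

    for _ in range(steps):
        new_y = {i: 0.0 for i in nodes}
        new_z = {i: 0.0 for i in nodes}
        for (j, i) in links:
            pend_y = y[j] / D[j] + buf_y[(j, i)]
            pend_z = z[j] / D[j] + buf_z[(j, i)]
            # One outcome per link and step, shared by both iterations.
            reliable = (j == i) or (rng.random() < q)
            if reliable:
                new_y[i] += pend_y         # delivered mass, including the backlog
                new_z[i] += pend_z
                buf_y[(j, i)] = 0.0
                buf_z[(j, i)] = 0.0
            else:
                buf_y[(j, i)] = pend_y     # dropped: backlog grows
                buf_z[(j, i)] = pend_z
        y, z = new_y, new_z

    # A node reports an estimate only when its z value is positive.
    return {i: y[i] / z[i] for i in nodes if z[i] > 0}


ring = {0: [0, 1], 1: [1, 2], 2: [2, 0]}
estimates = robust_ratio_consensus({0: 1.0, 1: 2.0, 2: 6.0}, ring)
```

In this example the initial values sum to 9 over three nodes, so each ratio should approach 3. Note that neither `y` nor `z` converges to a fixed vector on its own, since mass keeps moving between nodes and buffers; only the ratio settles.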
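The result above can be strengthened by proving convergence of the algorithm even if each node $i \in \mathcal{V}$ updates its estimate whenever $z_k[i] > 0$ (not necessarily $\geq \mu$). To prove the convergence in this case, the argument is similar to that in Theorem 1, with two modifications:

• Lemma 4 needs to be strengthened by observing that there exist infinitely many time instants at which $z_k[i] \geq \mu$ simultaneously for all $i \in \mathcal{V}$. This is true due to the existence of a matrix $T^*$ (as seen in the discussion of (M5)) that contains positive columns corresponding to all $i \in \mathcal{V}$.

• Using the above observation, and the argument in the proof of Theorem 1, it then follows that, with probability 1, for any $\psi$, there exists a finite $k_\psi$ such that $\delta(T_k) < \psi$ whenever $k \geq k_\psi$. As before, defining $\psi = \epsilon \mu / \mu_y$, it can be shown that for any $\epsilon$, there exists a $k_\epsilon \geq k_\psi$ such that the following inequality holds for all $i \in \mathcal{V}$ simultaneously.

$$\left| \frac{y_{k_\epsilon}[i]}{z_{k_\epsilon}[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \leq \epsilon, \quad \forall i \in \mathcal{V} \tag{66}$$

$$\Rightarrow \; \frac{\sum_j y_0[j]}{\sum_j z_0[j]} - \epsilon \leq \frac{y_{k_\epsilon}[i]}{z_{k_\epsilon}[i]} \leq \frac{\sum_j y_0[j]}{\sum_j z_0[j]} + \epsilon, \quad \forall i \in \mathcal{V}. \tag{67}$$

Naturally, $z_{k_\epsilon}[i] \neq 0$, $\forall i \in \mathcal{V}$. It is now easy to argue that the above inequality will continue to hold for all $k > k_\epsilon$ and each $i \in \mathcal{V}$ whenever $z_k[i] > 0$. To see this, observe that, for $k > k_\epsilon$,

$$y_k = y_{k_\epsilon} \prod_{j=k_\epsilon+1}^{k} M_j.$$

Define $P = \prod_{j=k_\epsilon+1}^{k} M_j$ and $\psi = \epsilon \mu / \mu_y$. Then, we have that, whenever $z_k[i] > 0$ for $k > k_\epsilon$,

$$\frac{y_k[i]}{z_k[i]} = \frac{\sum_j y_{k_\epsilon}[j] P[j,i]}{\sum_j z_{k_\epsilon}[j] P[j,i]} \quad \text{(summation over non-zero } P[j,i] \text{ terms)} \tag{68}$$

$$\Rightarrow \; \min_{j,\, P[j,i] > 0} \frac{y_{k_\epsilon}[j]}{z_{k_\epsilon}[j]} \leq \frac{y_k[i]}{z_k[i]} \leq \max_{j,\, P[j,i] > 0} \frac{y_{k_\epsilon}[j]}{z_{k_\epsilon}[j]} \tag{69}$$

$$\Rightarrow \; \frac{\sum_j y_0[j]}{\sum_j z_0[j]} - \epsilon \leq \frac{y_k[i]}{z_k[i]} \leq \frac{\sum_j y_0[j]}{\sum_j z_0[j]} + \epsilon \quad \text{from (67)} \tag{70}$$

$$\Rightarrow \; \left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \leq \epsilon \quad \text{for all } i \in \mathcal{V} \text{ and } k \geq k_\epsilon. \tag{71}$$

This proves the convergence of the algorithm in the limit. Recall that for this convergence it suffices if each node updates its estimate of the consensus whenever its $z$ value is positive. (69) follows from the observation that $\frac{y_k[i]}{z_k[i]} = \frac{\sum_j y_{k_\epsilon}[j] P[j,i]}{\sum_j z_{k_\epsilon}[j] P[j,i]}$ is a weighted average of the ratios $\frac{y_{k_\epsilon}[j]}{z_{k_\epsilon}[j]}$, and is therefore lower bounded by $\min_j \frac{y_{k_\epsilon}[j]}{z_{k_\epsilon}[j]}$ and upper bounded by $\max_j \frac{y_{k_\epsilon}[j]}{z_{k_\epsilon}[j]}$.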

VII. Concluding Remarks and Future Work



Although our analysis above is motivated by wireless environments wherein transmissions may not succeed, the analysis is more general. In particular, it applies to other situations in which properties (M1)-(M5) are true. Indeed, property (M4) by itself is not as important as its consequence that a given $W_k$ matrix has non-zero columns indexed by $i \in \mathcal{V}$.

A particular application of the above analysis is in the case when messages may be delayed. As discussed previously, mass is transferred by any node to its neighbors by means of messages. Since these messages may be delayed, a message sent on link $(i,j)$ in slot $k$ may be received by node $j$ in a later slot. Let us denote by $V_k[i]$ the set of messages received by node $i$ at step $k$. It is possible for $V_k[i]$ to contain multiple messages from the same node. Note that $V_k[i]$ may contain a message sent by node $i$ to itself as well. Let us define the iteration for $y_k$ as follows:

$$y_k[i] = \sum_{v \in V_k[i]} v. \tag{72}$$

The iteration for $z_k$ can be defined analogously. Our robust consensus algorithm essentially implements the above iteration, allowing for delays in delivery of mass on any link (caused by link failures). However, in effect, the robust algorithm also ensures FIFO (first-in-first-out) delivery, as follows. In slot $k$, if node $i$ receives mass sent by node $j \in \mathcal{I}_i$ in slot $s$, $s \leq k$, then mass sent by node $j$ in slots strictly smaller than $s$ is either received previously, or will be received in slot $k$.

The virtual buffer mechanism essentially models asynchronous communication, wherein the messages between any pair of nodes in the network may incur arbitrary delay, governed by some distribution. It is not difficult to see that the iterative algorithm (72) should be able to achieve consensus correctly even under other distributions on message delays, with possible correlation between the delays. In fact, it is also possible to tolerate non-FIFO (or out-of-order) message delivery provided that the delay distribution satisfies some reasonable constraints. Delay of up to $B$ slots on a certain link $(i,j) \in \mathcal{E}$ can be modeled using a single chain of $B$ virtual nodes, with links from node $i$ to every virtual node, and a link from the last of the $B$ nodes to node $j$; in this setting, depending on the delay incurred by a packet, the appropriate link from node $i$ to one of the virtual nodes on the delay chain (or to $j$, if the delay is 0) is used.

Note that while we made certain assumptions regarding link failures, the analysis relies primarily on two implications of these assumptions, namely (i) the rows of the transition matrix become close to identical as $k$ increases, and (ii) $z_k[i]$ is bounded away from 0 for each $i$ infinitely often. When these implications are true, similar convergence results may hold in other environments.